ABSTRACT

Closely related processing threads within a process in a
multiprocessor system are collected into thread groups
which are globally scheduled as a group based on the thread
group structure's priority and scheduling parameters. The
thread group structure maintains collective timeslice and
CPU accounting for all threads in the group. Within each
thread group, each individual thread has a local scheduling
priority for scheduling among the threads in its group. The
system utilizes a hierarchy of processing levels and run
queues to facilitate affining thread groups with processors or
groups of processors when possible. The system will tend to
balance out the workload among system processors and will
migrate thread groups up and down through processing
levels to increase cache hits and overall performance. The
system is periodically reset to avoid long term unbalanced
operation conditions.

32 Claims, 9 Drawing Sheets 



[Front-page drawing: PROCESS 200; box contents not recoverable from the scan]



03/17/2004, EAST Version: 1.4.1 



U.S. Patent    Apr. 28, 1998    Sheet 1 of 9    5,745,778

[Drawing not recoverable from the scan]



U.S. Patent    Apr. 28, 1998    Sheet 2 of 9    5,745,778

[FIG. 2: PROCESS 200, showing data 240 (subsets 241, 242 and 243) and
thread groups 210, 220 and 230 with their thread group structures
(RT/TGS 900, RT/TGS 810, TS/TGS 500) and threads; individual thread
boxes are only partly legible in the scan]



U.S. Patent    Apr. 28, 1998    Sheet 3 of 9    5,745,778

[Drawing not recoverable from the scan]



U.S. Patent    Apr. 28, 1998    Sheet 4 of 9    5,745,778

[FIG. 4: THREAD GROUP TABLE 401 (entries 0, 1, 2) and THREAD TABLE 402
(entries 0 through 6); reference numeral 211 also appears]



U.S. Patent    Apr. 28, 1998    Sheet 5 of 9    5,745,778

[FIG. 5A flowchart; legible box text:
 501  CPU available
 502  Check CPU's level 0 run queue, level 1 run queue and level 2
      run queue
 503  (decision box; text not legible)
 504  Sequentially check level 0 run queues of other CPUs in same
      CPU group
 505  (decision box; text not legible)
 506  Check run queue of other level 1 instance
 507  (decision box; text not legible)
 508  Sequentially check level 0 run queues of CPUs in other CPU group]



U.S. Patent    Apr. 28, 1998    Sheet 6 of 9    5,745,778

[FIG. 5B flowchart; legible box text, in order of appearance:
 Select TG from run queues by round robin selection
 Move TG to selecting CPU's level 0 run queue
 Execute highest priority thread in TG's local run queue
 Return
 Legible reference numerals: 520, 521, 527]



U.S. Patent    Apr. 28, 1998    Sheet 7 of 9    5,745,778

[FIG. 5C flowchart; legible box text:
 540  Select highest priority TG from other CPU's level 0 run queue
 541  Other CPU currently executing thread in selected TG?
 542  (YES) Move TG to level 1 run queue
 544  (NO) Move TG to selecting CPU's level 0 run queue
 543  Execute highest priority thread in TG's local run queue
 545  Return]



U.S. Patent    Apr. 28, 1998    Sheet 8 of 9    5,745,778

[FIG. 5D flowchart; legible box text, in order of appearance:
 550  Select highest priority TG from other level 1 run queue
 Move TG to level 2 run queue
 Move TG to selecting CPU's level 1 run queue
 Move TG to selecting CPU's level 0 run queue
 Execute highest priority thread in TG's local run queue
 554  Return
 Other legible reference numerals: 553, 556, 557]



U.S. Patent    Apr. 28, 1998    Sheet 9 of 9    5,745,778

[FIG. 5E flowchart; legible box text:
 560  Select highest priority TG from other level 0 run queue
 Other CPU currently executing thread in selected TG?
 562  (YES) Move TG to level 2 run queue
 565  (NO) Move TG to selecting CPU's level 0 run queue
 563  Execute highest priority thread in TG's local run queue
 564  Return]



APPARATUS AND METHOD FOR IMPROVED CPU AFFINITY IN A
MULTIPROCESSOR SYSTEM

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates generally to computer operating
systems and more particularly to a method for dynamically adjusting
the affinity between CPUs and processing threads in a multiprocessor
system.

2. Description of the Prior Art

Threads are programming constructs that facilitate efficient control
of numerous asynchronous tasks. Since they closely map to the
underlying hardware, threads provide a popular programming model for
applications running on symmetric multiprocessing systems.

As standard thread interfaces, such as the POSIX P1003.4a portable
operating systems programming standard propagated by the Technical
Committee on Operating Systems of the IEEE Computer Society, become
more common, an increasing number of portable applications employing
threads are being written and more operating system vendors are
providing thread support. Threads can provide significant performance
gains over sequential process execution. By breaking down a process
into multiple threads, different processors in the system can be
operating on different portions of the process at the same time.
Applications that can take particular advantage of threads include,
for example, database servers, real-time applications and
parallelizing compilers.

Modern multiprocessing systems can have eight or more individual
processors sharing processing tasks. Many such systems incorporate
caches that are shared by a subset of the system's processors. One
problem with many prior art multiprocessor systems, however, is poor
processor and cache affinity when a process executing on the system
creates multiple processing threads during its execution. In some
prior art systems each thread is assigned an individual [illegible]
throughout the system. In other prior art systems, individual threads
can be affined to individual CPUs, but there is no concept of affining
groups of related threads from the same process to a group of CPUs to
improve secondary cache affinity while improving efficiency of
operations among threads in the same group and reducing overhead for
operations between groups. [Illegible] related threads while
maintaining local efficiency. When multiple related threads, which
tend to access the same data, are distributed across multiple
processor groups an undesirably high level of data swapping in and out
of the system caches can occur.

SUMMARY OF THE INVENTION

The present invention relates to a method of operation of a
multiprocessor data processing system using an enhanced method of
organizing and scheduling threads.

It is a feature of the invention that processing threads are
collected into thread groups.

It is another feature of the invention that each thread group has a
global priority and is schedulable on a global basis.

It is a further feature of the invention that each thread has a local
priority for scheduling within its thread group.

It is yet another feature of the invention that a plurality of run
queues are employed.

It is yet a further feature of the invention that thread groups can
be dynamically moved to different run queues during system operation.

It is an advantage of the invention that cache affinity is improved.

It is another advantage of the invention that the processor work load
balancing is improved.

It is a further advantage that the local operation performance is
improved and intergroup processing overhead is reduced.

Other features and advantages of the present invention will be
understood by those of ordinary skill in the art after referring to
the detailed description of the preferred embodiment and drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is an overview of a multiprocessor data processing system.

FIG. 2 shows the [illegible] organization of a process executing on
system 150.

FIG. 3 shows the priority ranges for threads and thread groups.

FIG. 4 shows the thread/thread group addressing relationship.

FIGS. 5A-5E show the flow of thread group selection and migration.

DESCRIPTION OF THE PREFERRED EMBODIMENT

Overview

Referring to FIG. 1, an overview of a multiprocessing system 150 is
depicted. For clarity and ease of presentation an eight processor
system has been depicted. As will be appreciated by those of ordinary
skill in the art, the invention is applicable to multiprocessor
systems [illegible] of processors. It is also not necessary
[illegible] groups have the same number of processors. [Illegible]
will include at least separate data and [illegible], for example 8K
bytes of [illegible] instruction and [illegible] components of caches
108, an additional cache [illegible], for example, 1 megabyte of
random access [illegible] part of caches 108 in a typical system. CPUs
100-103 are connected to secondary cache 110. [Illegible] connected
via main system bus 130 [illegible] and other data processing elements
not shown. In a preferred eight processor embodiment, secondary caches
110 and 111 are each 32 megabytes and shared memory 120 is 1 gigabyte
of random access memory. Other sizes for each of these memory elements
could readily have been employed.

Thread Groups

As disclosed herein, a "thread group" is a set of closely-related
threads within a process that will tend to access and operate on the
same data. Handling these related threads as a single globally
schedulable group promotes a closer relationship between the threads
in the group and individual processors or groups of processors,
thereby improving the ratio of cache hits and overall system
performance. In



addition to improved processor affinity, additional efficiencies are
achieved by having these groups of closely-related threads use, at the
group level, a common priority, scheduling policy, timeslice and CPU
accounting. Since threads in a thread group will share scheduling
resources, thread-to-thread operations within a thread group will be
faster than thread-to-thread operations that cross group boundaries.

Referring to FIG. 2, a diagram of a typical process 200 running on
system 150 is shown. Process 200 contains thread groups 210, 220 and
230. TG 210 is a real-time (RT) thread group and was the initial
thread group in process 200. TG 210 has thread group structure (TGS)
211 and three timesharing (TS) threads 212-214 within its thread
group. TG 220 has thread group structure 221 and RT threads 222 and
223. TG 230 has thread group structure 231, RT 232 and TS 233. As will
be discussed in more detail below, the number appearing within each
thread group structure box 211, 221 and 231 in FIG. 2 indicates the
thread group's global scheduling priority across the system. The
number appearing within each thread box indicates the thread's
priority within its particular thread group.

Also conceptually located within process 200 in FIG. 2 is the set of
data 240, access to which potentially will be required by threads
during execution of process 200. In FIG. 2, data 241 represents the
subset of data 240 that supports the task to be performed by the
threads within TG 210. Similarly, data 242 and 243 support the tasks
to be performed by the threads within TGs 220 and 230.

A thread group can be created as either a realtime group, such as
thread group 210 or 220, or a timesharing group, such as 230, and
either type of thread group can have realtime threads, timesharing
threads or a combination within its group. Active threads in a process
may create one or more additional threads. When a new thread is
created, it can be created within the creating thread's thread group,
within another existing thread group, or it can be made the initial
thread in a new thread group. For example, TG 220 may have been
created by a thread within process 200 or by a thread in another
process in the system. If a thread is being formed as the initial
thread in a new thread group, the new thread group's thread group
structure is first created by [illegible] the thread group structure
of the creating thread's thread group. Then the new thread is created
in the newly created thread group. The creating thread assigns the
local scheduling and priority to the newly created thread. Unless
otherwise specified, the newly created thread will inherit the
scheduling policy and priority of its creating thread. The newly
created thread may have a priority that is higher, lower or the same
as the priority of the thread that created it. Similarly, individual
threads within a thread group may have a priority that is higher,
lower or the same as the priority of its thread group.

The thread group is the basic unit of global scheduling across the
system. The thread group structure maintains the global scheduling
policy and global priority which are used to schedule the execution of
the thread group. The thread group structure also maintains the
cumulative timeslice and CPU accounting for all threads in its thread
group, so timeslicing and CPU accounting records for individual
threads within a thread group are not necessary. Each individual
thread within the thread group maintains the thread priority and
scheduling policy for itself.

The particular method used by a CPU in selecting a thread group to
execute is discussed below. Once a particular thread group is selected
for execution, the individual thread to be executed is selected based
on the local priority and scheduling policy of the threads within the
group. Selection of a thread for execution, therefore, occurs at two
independent levels: global scheduling of a thread group followed by
local scheduling of one of that thread group's threads. The priorities
of the individual threads within a thread group have no bearing on the
scheduling of the thread group itself, which is based solely on the
thread group's priority in the thread group structure.

Execution of a process will often involve a plurality of thread
groups, each with a plurality of threads. The use of thread groups in
developing a process gives the user the flexibility to choose between
creating a new thread within an existing thread group or creating a
new thread group. The user can make that decision based on the most
efficient approach to handle the various tasks within the process. If,
for example, a number of threads are being used to work on a
particular calculation and all threads will require access to the same
set of data, those threads properly belong in a single thread group.
On the other hand, if a process is going to initiate a new task within
the process that is not closely coupled with the task of an existing
thread [illegible] the threads of the new task will require access to
a different subset of data 240, then a new thread group is indicated.

FIG. 4 shows the structure of the thread and thread group tables
maintained in system memory. A thread group [illegible] process in the
system. Thread group table 401 converts the thread group ID, which was
assigned at the time the thread group was created, to a pointer to the
thread group structure for that thread group. Thread table 402
converts the thread ID, which was assigned at the time the thread was
created, to a pointer to the thread. Each thread entry also contains a
pointer to its associated thread group structure. For clarity, the
pointers from threads 222 and 223 to RT/TGS 221 and the pointers from
threads 232 and 233 to TS/TGS 231 are not shown.

Looking again at FIG. 1, the various memory components within system
150 can be [illegible] three processing levels. [Illegible] contain
one or more "instances" [illegible] shared memory hierarchy. Level 0
contains eight level 0 instances, each instance being one of the CPUs
100-107 and its associated caches 108. Level 1 contains two level 1
instances, each comprising four level 0 instances and a secondary
cache. Finally, there is a single Level 2 instance containing the two
level 1 instances and the shared system memory.

The design and operation of a multiprocessor [illegible] as system
150 requires the reconciliation of two competing system goals. On the
one hand, the system designer must insure that time critical
operations are executed in a timely manner. On the other hand, the
designer desires to maximize system throughput to the greatest extent
possible. Referring to FIG. 1, all threads that are "visible" at Level
2 can be taken and executed by any of the eight CPUs 100-107. Since
there are eight CPUs that may potentially run the threads at level 2,
this maximizes the opportunity of each thread to run. This will,
however, result in threads from the same thread group being run in
different CPU groups (i.e., CPUs 100-103 or CPUs 104-107). System
throughput suffers because threads from multiple thread groups spread
across CPU groups will result in more cache activity.

At the other extreme, if all thread groups were to be assigned at
Level 0 (i.e., each thread group assigned to a specific processor),
local cache affinity would be clearly enhanced, since the likelihood
of cache hits is higher with all threads that are working on the same
set of data running on the same processor. Assignment of time critical
threads to a
single processor is not desirable, however, since it increases
the likelihood that a CPU will become busy and not be able
to execute all time critical threads on schedule.

As mentioned above, in a preferred embodiment of system
150, there are two levels of scheduling. All thread
groups have a group priority and can be scheduled globally
to compete for CPU resources available anywhere in system
150. Once a CPU has selected a thread group to run, the CPU
will select a thread within the thread group according to the
local priorities within the thread group. In both situations,
the method of scheduling follows the policies and priorities
defined in the POSIX P1003.4 standard.
Scheduling and Priority 

Referring to FIG. 3, the priority distribution table for
threads and thread groups is shown. While various priority
level schemes could have been employed, in a preferred
embodiment each thread and thread group will have an
individual priority represented by a four digit hexadecimal
number ranging from 0000, representing the lowest possible
priority, to 1BFF, representing the highest. This range allows
for a total of 7168 possible priority levels.

In a preferred embodiment, different types of threads and
thread groups are normally assigned priorities in a limited
portion of the possible range. Timesharing threads and
thread groups are typically assigned priorities in the range of
0400 to 07FF while realtime threads and thread groups are
assigned priorities in the range of 800 to 0BFF. A subset of
all realtime threads are realtime operating system threads,
such as UNIX kernel demons. Generally, middle priority
demons, such as Streams in a UNIX system, will be assigned
priorities in the range of 0C00 to 0FFF and high priority
demons in the range of 1800 up. However, as indicated by
the arrow in FIG. 4, if the user considers it necessary, kernel
demons are allowed to receive a priority anywhere in the full
priority range.
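
The ranges above can be restated as a small lookup, offered only as an illustrative sketch. The names `PRIORITY_MAX`, `TYPICAL_BANDS`, `priority_levels` and `typical_band` are hypothetical and do not come from the patent; only the hexadecimal boundaries are taken from the text, and "1800 up" is assumed to end at the top of the range, 1BFF.

```python
from typing import Optional

# Boundaries quoted in the text; all names here are hypothetical.
PRIORITY_MIN = 0x0000
PRIORITY_MAX = 0x1BFF

# (label, lowest, highest) for the typical ranges the text names.
TYPICAL_BANDS = [
    ("timesharing", 0x0400, 0x07FF),
    ("realtime", 0x0800, 0x0BFF),
    ("middle priority demon", 0x0C00, 0x0FFF),
    ("high priority demon", 0x1800, PRIORITY_MAX),  # "1800 up" (assumed cap)
]


def priority_levels() -> int:
    """Number of distinct priority values from 0000 to 1BFF inclusive."""
    return PRIORITY_MAX - PRIORITY_MIN + 1


def typical_band(priority: int) -> Optional[str]:
    """Return the typical band containing the priority, or None if it is outside all of them."""
    for label, lowest, highest in TYPICAL_BANDS:
        if lowest <= priority <= highest:
            return label
    return None
```

Since kernel demons may be given any priority in the full range, a band label found this way describes only the typical assignment, not a guarantee about the thread's type.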

During execution of processes on the system, threads will
occasionally acquire a critical kernel resource. To facilitate
swift freeing of the resource, provisions are made for
temporarily adjusting the thread's priority level. If a thread,
other than a kernel demon, is holding a critical kernel
resource, the thread's priority is adjusted upward and the
thread is placed in the Level 2 global run queue with the
thread groups. In a preferred embodiment, the adjustment is
accomplished by adding hexadecimal C00 to the thread's
initial priority. As shown in FIG. 4, this adjustment will give
a thread with a critical resource a higher priority than the
typical priority range of Streams demons, but not as high as
the typical priority range of high priority kernel demons.
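
A short worked check of that arithmetic, as a sketch rather than the patented implementation: adding hexadecimal C00 to any priority in the typical timesharing or realtime ranges lands strictly between the Streams demon range (up to 0FFF) and the high priority demon range (1800 up). The function and constant names are hypothetical.

```python
# Hypothetical names; the constants are the hexadecimal values quoted in the text.
CRITICAL_RESOURCE_BOOST = 0x0C00
STREAMS_DEMON_TOP = 0x0FFF       # top of the typical Streams demon range
HIGH_DEMON_BOTTOM = 0x1800       # bottom of the typical high priority demon range


def boosted_priority(initial_priority: int) -> int:
    """Priority of a non-demon thread while it holds a critical kernel resource."""
    return initial_priority + CRITICAL_RESOURCE_BOOST


# Timesharing 0400..07FF maps to 1000..13FF; realtime 0800..0BFF maps to 1400..17FF.
timesharing_boosted = (boosted_priority(0x0400), boosted_priority(0x07FF))
realtime_boosted = (boosted_priority(0x0800), boosted_priority(0x0BFF))
```

Both boosted ranges stay above 0FFF and below 1800, which matches the ordering the paragraph describes.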

Streams and high priority demon threads in the kernel
are scheduled without timeslice monitoring and will
run until either they are preempted or they relinquish the
CPU by blocking or yielding. The priority of these threads
is not adjusted if a critical kernel resource is held. Realtime
threads can be scheduled either with or without timeslice
monitoring. If a realtime thread is scheduled without
timeslice monitoring, it will run until either it is preempted
by a higher priority thread or it voluntarily relinquishes the
CPU by blocking or yielding. Timesharing threads all
include timeslicing. All timesharing threads and all realtime
threads with timeslicing enabled will be taken off the CPU
if the thread group timeslice runs out as well as in the event
of preemption, yielding or blocking.

Referring again to FIG. 2, the priorities of the thread
groups and threads will be examined. As can be seen, the
priority of the thread group can be either higher or lower
than the priority of the individual threads within that thread
group. In the example of process 200 in FIG. 2, TG 210 has
been assigned a priority of 900, TG 220 has a priority of 810
and TG 230 has a priority of 500. If, for example, thread
groups 210, 220 and 230 happened to be the only thread
groups active in system 150, then the next available CPU
would select TG 210 as the source of the next thread to run,
since TG 210 has the highest priority of the available thread
groups.

Once TG 210 is selected, the particular thread to be run is
taken from the local thread group run queue. TG 210's active
threads 212-214 have priorities of 405, 400 and 405 respectively.
Threads are placed on their run queues in order of
their priorities. The order of threads of equal priority within
the run queue is determined by scheduling policy. In a
preferred embodiment, if the thread is being scheduled for
the first time, was taken off the CPU because of timeslice
runout or, with the exception of kernel demons, is awakened
after being blocked, the thread is placed on the run queue
after all other threads of equal priority. If the thread had been
preempted, it is placed ahead of other threads of equal
priority.
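
The two-level choice and the equal-priority placement rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class and function names are hypothetical, priorities are treated as hexadecimal values, the single thread priorities given to TG 220 and TG 230 are invented for the example, and ties between thread groups (which the patent breaks by round-robin) are not modelled.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Thread:
    thread_id: int
    priority: int  # local priority, meaningful only inside its thread group


@dataclass
class ThreadGroup:
    group_id: int
    priority: int  # global priority kept in the thread group structure
    run_queue: List[Thread] = field(default_factory=list)  # highest priority first

    def enqueue(self, thread: Thread, preempted: bool = False) -> None:
        """Insert by priority; a preempted thread goes ahead of its equals, others go behind."""
        position = len(self.run_queue)
        for index, queued in enumerate(self.run_queue):
            if preempted:
                goes_before = queued.priority <= thread.priority
            else:
                goes_before = queued.priority < thread.priority
            if goes_before:
                position = index
                break
        self.run_queue.insert(position, thread)


def pick_next_thread(groups: List[ThreadGroup]) -> Thread:
    """Global step: highest priority group. Local step: head of that group's run queue."""
    runnable = [group for group in groups if group.run_queue]
    chosen = max(runnable, key=lambda group: group.priority)
    return chosen.run_queue.pop(0)


# Thread groups from the FIG. 2 example; thread priorities for TG 210 are from the text.
tg210 = ThreadGroup(210, 0x900)
for thread_id, priority in [(212, 0x405), (213, 0x400), (214, 0x405)]:
    tg210.enqueue(Thread(thread_id, priority))
# Hypothetical thread priorities, chosen to show they do not affect group selection.
tg220 = ThreadGroup(220, 0x810, [Thread(222, 0x0BFF)])
tg230 = ThreadGroup(230, 0x500, [Thread(233, 0x0BFF)])
```

With these values `pick_next_thread([tg210, tg220, tg230])` returns thread 212, even though the threads in the other groups carry higher local priorities, because only the group priority is compared at the global step.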
CPU Affinity

As is well understood, it is highly desirable to maximize
the likelihood that the data needed by a thread is to be found
in the local cache of the CPU running the thread or, if not
there, in the secondary cache associated with that CPU's
CPU group. Trips to the main shared memory to get data not
located in the caches introduce delay into the processing of
the thread and impact overall system throughput. At the
same time, steps taken to increase cache locality cannot
impact the timely execution of time critical thread operations.

Associated with each thread group, and available to all
CPUs in the system, are attributes specifying the thread
group's allowable CPU or set of CPUs and the thread
group's minimum allowed processing level. The CPU
attribute identifies the specific CPU or set of CPUs in the
system on which the thread group is allowed to run.
Typically, this attribute will identify all CPUs in system 150
as being allowable, though a subset of system CPUs could
be specified by the user. The minimum allowed processing
level attribute specifies the minimum processing level (0, 1
or 2) at which the thread group may be affined.

The minimum allowed processing level for timesharing
thread groups will typically be 0. This will allow the thread
group to "migrate" down to either level 1, where the thread
group will be affined to a particular group of four CPUs
sharing a secondary cache at level 1, or to level 0, where the
thread group will be affined to a particular CPU. Affining a
thread group to a specific CPU or single group of CPUs
improves cache locality for the threads in the thread group.

The minimum allowed processing level for realtime
thread groups will typically be Level 2, which will preclude
the thread group from migrating below the top processing
level, ensuring that the thread group will always be available
to the maximum number of CPUs. Response time for
realtime thread groups is, therefore, optimized. The user can
specify, via the minimum allowed processing attribute, that
a realtime thread group be allowed to migrate to Level 1 or
Level 0.

In a preferred embodiment of the invention, maintaining
affinity between thread groups and processing instances is
accomplished by means of the system run queues. In the
described system, there will be a total of eleven run queues:
eight level 0 queues (one for each CPU), two level 1 queues,
and one level 2 queue. Every available thread group will be
in one, and only one, of these eleven queues. A newly
created thread group inherits the run queue of the creating
thread group as well as its affinity attributes.
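
One way to picture the eleven run queues and the "one, and only one" rule is the sketch below. It assumes, purely for illustration, that queues are keyed by a (level, owner) pair using the patent's reference numerals for CPUs, secondary caches and shared memory; the function names and the dictionary layout are hypothetical.

```python
from typing import Dict, List, Tuple

# Reference numerals from the description: two CPU groups, each behind one secondary cache.
CPU_GROUPS: Dict[int, List[int]] = {
    110: [100, 101, 102, 103],
    111: [104, 105, 106, 107],
}
SHARED_MEMORY = 120

QueueKey = Tuple[int, int]  # (processing level, numeral of the owning CPU, cache or memory)


def build_run_queues() -> Dict[QueueKey, List[int]]:
    """Create eight level 0 queues, two level 1 queues and one level 2 queue."""
    queues: Dict[QueueKey, List[int]] = {}
    for cache, cpus in CPU_GROUPS.items():
        queues[(1, cache)] = []
        for cpu in cpus:
            queues[(0, cpu)] = []
    queues[(2, SHARED_MEMORY)] = []
    return queues


def place(queues: Dict[QueueKey, List[int]], group_id: int, key: QueueKey) -> None:
    """Move a thread group to one queue, removing it from any queue it was in before."""
    for members in queues.values():
        if group_id in members:
            members.remove(group_id)
    queues[key].append(group_id)


def clamp_level(wanted_level: int, minimum_allowed_level: int) -> int:
    """A thread group may not be affined below its minimum allowed processing level."""
    return max(wanted_level, minimum_allowed_level)


run_queues = build_run_queues()
place(run_queues, 230, (2, SHARED_MEMORY))  # a new timesharing group starts at level 2
place(run_queues, 230, (0, 100))            # then is pulled down by the CPU that selects it
```

Routing every move through a single `place` helper is a design choice of this sketch that keeps a thread group from ever appearing in two queues at once.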
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located that contains an available thread group (step 505). 
CPU 100 stops looking and selects the hi^est priority 
thread group from ttie run queue (step 540). If no thread is 
cmrcntiy running in the available thread group (step 541), 
the thread group is taken away from the Level 0 run queue 
where it currently resides and is pulled over to the CPU 100 
Level 0 run queue (step 544). The thread group will then be 
afiiaod solely wife CPU 100 at level 0. Alternatively, if a 
thread in the available thread group is cuirratly running on 
the odiar CPU at step 541. CPU 100 will move the thread 
group from the other CPU's level 0 run queue to the level 1 
run queue shared by CPUs 100-103 {step 542). The thread 
group is then at processing level 1 and made available to all 
four CPUs in CPU lOO's CPU group. 

Jf CPU 100 has checked the run queues of the other three 
CPUs in its CPU group and still has not located an eligible 
thread group, CPU 100 then begins to check (step 506) the 
other level 1 run queue for eligible thread groups affined to 
the other secondary cache 111. In a preferred embodiment, 



Rcfeoring to FIG. 2, for example, when TG 230 is first 
created it is placed in the level 2 queue. Since TG 230 is a 
timesharing thread group, when one of the CPUs 10<>-107 
eventually selects it to run for the first tiine» TG 230 is 
**pulled down" to fee particular CPU that selected it Ihc 5 
selecting CPU accomplishes this by removing the thread 
group from the level 2 run queue and placing it in its own 
level 0 run queue. That CPU is now affined whh TG 230 and 
wOl continue to run the threads in TG 230 until TG 230 is 
either reaffined or the system Is reset, as discussed below. lO 

Referring to FIGS. 5A-5E, the sequence followed by 
CPU 100 in selecting a thread group to execute will be 
discussed. A similar sequence is followed for aU CPUs in the 
system. When CPU 100 becomes available to execute a 
thread (step 501), it looks (step 502) at its own Uvcl 0 run 15 
queue, the Level 1 run queue for its associated cache 110 and 
the Level 2 run queue. If one or more thread groups arc 
available, CPU 100 proceeds to check for tiie highest 

priority tiircad group (step 519). If the hi^cstpiority level ^ a. ^ 

is shared by thread groups on different run queues, CPU 100 20 aU level I instances would be checked for ebgible threads 
will break the tie (step 521) by selecting the run queue to use before proceeding to check the level 0 run queues of the 
in a round-robin fashion. CPUs in o&er CPU groups. For example, in a 16 processor 

Once a thread group is selected, CPU 100 will begin the embodiment of the invention there would be four secondary 
process of determining if the thread group can be moved caches at level 1. CPU 100 would sequentiaUy check each 
doscr to die CPU. If the minimum aUowed processing level 25 of the otha three secondary caches for an available thread 
attribute of the diread group mandates that the group nuist group befwe proceeding to check the level 0 run queues of 
remain at Level 2 (step 522), CPU 100 i^oceeds to thread the CPUs in other CPU groups. 

execution (step 527), where the highest priority thread in the If an available thread group is located at stqp 507 ^ CPU 
local run queue of the thread group is selertcd. If affinity 100 stops looking and proceeds to step 550. If a thread in me 
below Level 2 is aUowcd, the CPU moves to step 523. 30 selected thread group is currently running on one of the 

CPUs in the other CPU group at step 551, the thread group 
is pulled up to the least common node in the CPU hierarchy 
(step 552). In this case, the least common node is shared 

system memory 120 and, therefwc, the thread group would 

(step 524)lf a^ tfiread in t^^ 35 be puUed up and placed on the level 2 run queue, 

run by another CPU in the system. If not, CPU 100 checks Alternatively, if no thread is currently running in the diread 
(step 525) if Level 0 affinity is aUowed for the thread group. group, the thread group will either be moved to the CPU 100 
If so, the thread group is removed from the run queue where Level 0 run queue (step 556) if Uvel 0 affinity is aUowed 
it currcnUy resides and is placed (step 526) in the CPU 100 (step 555) or, if affinity at Level 0 is not aUowcd f or die 
run queue. If it was dctennined at step 525 that ttic thread 40 thread group, the thread group wiU be added to the Level I 
group cannot be affined at Level 0, as could be the case if the
user had chosen to limit the allowable processing level for
that thread group, it is determined (step 532) if the thread
group is currently in the Level 1 run queue. If not, the thread
group is moved from the Level 2 run queue to the Level 1
run queue (step 533) to increase affinity as much as allowed
by the minimum processing level constraint. Returning to
step 524, if a thread of the thread group is currently being
executed by another CPU, CPU 100 cannot move the thread
group into its Level 0 run queue. CPU 100 will still try to
improve affinity as much as possible. If the thread group is
in the Level 1 run queue associated with CPU 100 (step
529), no closer affinity is presently possible. If the thread
group is not in the Level 1 run queue, but is in the Level 2
run queue, the thread group is pulled down to the cache 110
Level 1 run queue (step 531) making the thread group
affined with CPU 100 and the other CPUs in the CPU 100
thread group. Finally, if the CPU running a thread in the
thread group is in the other CPU group (CPU 104-107), the
thread group must remain affined at Level 2 (step 530).

If the selected thread group is already affined with CPU
100 (step 523), the CPU proceeds to select a thread to
execute (step 527). If the selected thread group is not
currently in the CPU 100 run queue, CPU 100 determines [...]

If an eligible thread group was not located at step 503,
CPU 100 next checks (step 504) one-by-one the level 0 run
queues of the other CPUs in its CPU group. In a preferred
embodiment with four CPUs per CPU group, CPU 100 will
first check the CPU 101 queue. If no eligible thread is found
with CPU 101, CPU 100 will check CPU 102's queue and,
if necessary, CPU 103's queue. As soon as a run queue is
[...] run queue for CPU 100's CPU group (step 557).

If CPU 100 has still not located an available thread group,
CPU 100 then begins to sequentially check CPUs 104-107
in the other CPU group (step 508). If an eligible thread group
is located, the highest priority thread is selected (step 560).
If a thread is currently running, the thread group is pulled up
to the least common node in the CPU hierarchy. The least
common node in this situation is again shared memory 120
and, therefore, the thread group would be pulled up and
entered in the level 2 run queue (step 562). If no thread is
currently running, the thread group is taken away from the
other CPU and pulled over to CPU 100's level 0 queue (step
565).

Finally, if no eligible threads are located anywhere in the
system, CPU 100 will run an idle loop (step 510) until a
thread group becomes available. From the above
description, it can be seen that timesharing thread groups can
migrate up and down through the three processing levels of
the system and can, at various times, be affined with individual
CPUs, with groups of CPUs or with all CPUs in the system.
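
The "least common node" rule described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the patented implementation: the names ThreadGroup, cpu_group and target_level are hypothetical, CPUs are numbered 100-107 in two groups of four as in the preferred embodiment, and deriving the CPU group by integer division is an assumption made for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class ThreadGroup:
    """Hypothetical stand-in for a thread group (names are assumptions)."""
    name: str
    min_level: int = 0                            # lowest level at which it may be affined
    running_on: set = field(default_factory=set)  # CPUs now running its threads


def cpu_group(cpu: int, cpus_per_group: int = 4) -> int:
    """Assumed mapping: CPUs 100-103 share one group, 104-107 the other."""
    return cpu // cpus_per_group


def target_level(tg: ThreadGroup, cpu: int, cpus_per_group: int = 4) -> int:
    """Level at which tg should be affined when `cpu` selects it.

    0 = the selecting CPU's own queue, 1 = its CPU group's queue,
    2 = the system-wide queue.
    """
    others = tg.running_on - {cpu}
    if not others:
        level = 0   # nobody else runs it: closest possible affinity
    elif all(cpu_group(o, cpus_per_group) == cpu_group(cpu, cpus_per_group)
             for o in others):
        level = 1   # shared only with CPUs behind the same secondary cache
    else:
        level = 2   # least common node is the shared memory
    return max(level, tg.min_level)  # honor the minimum processing level
```

For example, a thread group with a thread running on CPU 101 that is selected by CPU 100 yields Level 1, while one running on CPU 105 yields Level 2.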
Load Balancing 

The system described above has an inherent tendency to 
balance the processing load. If the system is in a relatively 
idle period, timesharing thread groups tend to get pulled up 
to higher level run queues and have their threads shared by
multiple CPUs. CPUs that find themselves with a light
workload will help out busier CPUs and, over time, tend to
take over some timesharing thread groups from the busier
CPUs in the system. Conversely, if the system becomes
busy, timesharing thread groups tend to migrate downward.
This closer affinity between timesharing thread groups and 
CPUs improves cache locality and is desirable. 

In most situations, the timesharing thread groups will 
distribute themselves in a substantially even manner across 
the CPUs. It is theoretically possible, however, that thread 
groups in a busy system may become distributed in an 
unbalanced manner such that some CPUs are busier than 
others causing some thread groups to be executed at a slower 
than desirable rate. 

As a check against operation of the system in an unbal- 
anced condition over a prolonged period of time, the system 
will periodically clear all level 0 and level 1 run queues and 
pull all thread groups back up to level 2. The thread groups 
will immediately begin to again migrate downward. This 
reset function prevents any unbalanced load condition from 
existing for more than a relatively short period of time. In a 
preferred embodiment, this reset function occurs every 10 
seconds, though other time periods may be configured. 
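
The periodic reset can be illustrated with a small Python sketch. The representation is an assumption made for illustration only: Level 0 and Level 1 run queues are modeled as dictionaries of lists keyed by CPU and by CPU group, the Level 2 run queue as a single list, and the names RESET_INTERVAL_SECONDS and reset_run_queues are hypothetical.

```python
RESET_INTERVAL_SECONDS = 10  # value from the preferred embodiment; configurable


def reset_run_queues(level0, level1, level2):
    """Clear every Level 0 and Level 1 queue, pulling all thread groups to Level 2.

    level0: dict mapping CPU id -> list of thread groups (assumed layout)
    level1: dict mapping CPU group id -> list of thread groups (assumed layout)
    level2: list of thread groups affined with all CPUs
    """
    for queues in (level0, level1):
        for queue in queues.values():
            level2.extend(queue)  # pull the thread groups back up
            queue.clear()         # leave the lower-level queue empty
    return level2
```

After such a reset, the normal selection logic would again migrate the thread groups downward as CPUs pick them up.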

The invention may be implemented in other specific
forms without departing from the spirit or essential
characteristics thereof. For example, while a system having three
levels of run queues has been discussed above, it will be
understood that the same concepts can be readily extended
to systems organized with more than three processing levels.
The scope of the invention is indicated by the appended
claims rather than by the foregoing description and all
changes within the meaning and range of equivalency of the
claims are intended to be embraced therein.

I claim: 

1. A data processing system for simultaneously executing 
a plurality of processing tasks, the system comprising: 

a plurality of processors, each processor having first cache
means;

a plurality of second cache means, each second cache
means being connected to a subset of the processors;

shared memory means connected to each second cache
means; and

means for retaining a plurality of run queues, including a
plurality of Level 0 run queues, each Level 0 run queue
being associated with one of the processors and
containing the processing tasks currently affined to its
associated processor; a plurality of Level 1 run queues,
each Level 1 run queue being associated with one of the
subsets of processors and containing the processing
tasks currently affined to its associated subset of
processors, and a Level 2 run queue associated with all
processors and containing the processing tasks
currently affined to all processors in the system, each
processing task being included in only one of the run
queues.

2. The system of claim 1 further comprising means for
moving at least one processing task among the run queues.

3. The system of claim 2 wherein each processing task has
means for indicating the allowable run queue levels at which
that processing task may be affined.

4. The system of claim 1 further comprising means for
identifying when a processor is available to begin execution
of a processing task, means for selecting a processing task
to be run by the identified available processor and means for
determining if the selected processing task should be moved
to a run queue different from the run queue where the
selected processing task is currently affined.

5. The system of claim 4 wherein each processing task has
means for indicating the allowable run queue levels at which
that processing task may be affined.

6. The system of claim 4 wherein each processing task has
means for indicating which of the processors are allowed to
run that processing task.

7. In a multiprocessor system having a shared memory
accessible to all processors and a plurality of secondary
cache memories, each secondary cache memory being
accessible to a subset of the processors, each processor
having an associated Level 0 run queue containing the
processing tasks affined with that CPU, each subset of
processors having an associated Level 1 run queue
containing the processing tasks affined with that subset of
processors and all processors sharing a Level 2 run queue
containing the processing tasks affined with all processors in the
system, a processing task containing one or more processing
threads: a method of selecting the next processing task to be
executed by an available processor comprising the steps of:

a) checking the Level 0 run queue of the available
processor, the Level 1 run queue of the available
processor's subset of processors and the Level 2 run
queue for available tasks;

b) if one or more available tasks are located at step a,
selecting one of the available tasks for execution;

c) if no available tasks are located at step a, checking the
Level 0 run queue of one of the other processors in the
available processor's processor subset;

d) if one or more available tasks are located at step c,
selecting a task from the one or more available tasks;

e) repeating steps c and d for each other processor in the
available processor's subset;

f) if no available task is located at steps c-e, checking the
Level 1 run queue of one of the other processor groups
in the system;

g) if one or more available tasks are located at step f,
selecting one of the available tasks for execution;

h) repeating steps f and g for each other Level 1 run queue
in the system;

i) if no available task is located at steps f-h, checking the
Level 0 run queue of one of the processors in one of the
other processor subsets;

j) if one or more available tasks are located at step i,
selecting the thread group for execution;

k) repeating steps i and j for each processor in each other
processor group in the system;

l) if no available task is located at steps i-k, running an
idle loop in the available processor.
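
The search order recited in steps a) through l) can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the claimed method itself: the names search_order and pick_task are hypothetical, run queues are modeled as a dictionary keyed by (level, owner) tuples, and the sketch simply takes the first entry of the first non-empty queue rather than comparing priorities across queues.

```python
def search_order(cpu, groups):
    """Return run-queue keys in the order an available CPU would check them.

    groups: list of lists of CPU ids, one inner list per processor subset.
    Keys are ("L0", cpu_id), ("L1", group_index) or ("L2", None).
    """
    own = next(i for i, g in enumerate(groups) if cpu in g)
    order = [("L0", cpu), ("L1", own), ("L2", None)]               # step a
    order += [("L0", c) for c in groups[own] if c != cpu]          # steps c-e
    order += [("L1", i) for i in range(len(groups)) if i != own]   # steps f-h
    order += [("L0", c)                                            # steps i-k
              for i, g in enumerate(groups) if i != own
              for c in g]
    return order


def pick_task(cpu, groups, queues):
    """Return (queue_key, task) for the first available task, or None.

    A None result corresponds to step l: the caller would run its idle loop.
    """
    for key in search_order(cpu, groups):
        if queues.get(key):
            return key, queues[key][0]
    return None
```

With two subsets of four processors, an available processor 100 would check its own three queues, then processors 101-103, then the other subset's Level 1 queue, and finally processors 104-107.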

8. The method of claim 7 wherein each processing task
has an associated run queue indicator indicating the
minimum processing level at which the associated processing
task may be affined and where step b) includes the additional
steps of:

1) if the selected task can be affined below Level 2 and the
selected task is not already in the Level 0 run queue of
the available processor and the processing task is not
currently being run by another processor and the
selected task can be affined at Level 0, moving the
selected task to the Level 0 run queue of the available
processor;

2) if the selected task can be affined below Level 2 and is
not already in the Level 0 run queue of the available
processor and is not currently being run by another
processor and cannot be affined at Level 0 and is not
currently affined at Level 1, moving the selected task to
the Level 1 run queue of the available processor;
3) if the selected task can be affined below Level 2 and is
not already in the Level 0 run queue of the available
processor and is currently being run by another
processor and is not currently in a Level 1 run queue and
the other processor currently running the selected task
is a member of the same subset of processors to which
the available processor belongs, moving the selected
task to the Level 1 run queue of the available processor.

9. The method of claim 7 wherein step d) includes the
additional steps of:

1) if the other processor associated with the run queue
checked at step c) is currently running the selected
processing task, moving the selected processing task to
the Level 1 run queue of the available processor;

2) if the other processor associated with the run queue
checked at step c) is not currently running the selected
processing task, moving the selected processing task to
the Level 0 run queue of the available processor.

10. The method of claim 7 wherein step g) includes the
additional steps of:

1) if any processor associated with the run queue checked
at step f) is currently running the selected task, moving
the selected task to the Level 2 run queue;

2) if no processor associated with the run queue checked
at step f) is currently running the selected task and the
selected task can be affined at Level 0, moving the
selected task to the Level 0 run queue of the available
processor;

3) if no processor associated with the run queue checked
at step f) is currently running the selected task and the
selected task cannot be affined at Level 0, moving the
selected task to the Level 1 run queue of the available
processor.

11. The method of claim 7 wherein step i) includes the
additional steps of:

1) if the processor associated with the run queue checked
at step j) is currently running the selected task, moving
the task to the Level 2 run queue;

2) if the processor associated with the run queue checked
at step j) is not currently running the selected task,
moving the task to the Level 0 run queue of the
available processor.

12. A multiprocessor system capable of executing a
plurality of processing threads, the system comprising: a
plurality of processors, memory means connected to and shared
by the plurality of processors, one or more processes
executing in the system, each process containing one or more
processing thread groups, each thread group containing at
least a thread group structure, and wherein each thread group
has an associated priority level and each thread has an
associated priority level, the priority level of each thread
being independent of the priority level of its associated
thread group.

13. A multiprocessor system capable of executing a
plurality of processing threads, the system comprising: a
plurality of processors, memory means connected to and shared
by the plurality of processors, one or more processes
executing in the system, each process containing one or more
processing thread groups, each thread group containing at
least a thread group structure, and wherein at least one
thread within a thread group has means for creating a new
thread within its thread group.

14. A multiprocessor system capable of executing a
plurality of processing threads, the system comprising: a
plurality of processors, memory means connected to and shared
by the plurality of processors, one or more processes
executing in the system, each process containing one or more
processing thread groups, each thread group containing at
least a thread group structure, and wherein at least one
thread within a thread group has means for creating a new
thread in a new thread group.

15. The system of claim 14 wherein [illegible] the thread
group [illegible] containing the thread which created the new
thread group.

16. The system of claim 14 wherein the new thread group
[illegible] and scheduling policy by the thread [illegible]
created the new thread group.

17. The system of claim 14 wherein the new thread group
is affined to the same run queue as the run queue of the
thread group of the thread which created the new thread
group.

18. The system of claim 14 wherein the new thread group
is assigned the same scheduling policy as the scheduling
policy of the thread group which created the new thread
group.

19. A multiprocessor system capable of executing a
plurality of processing threads, the system comprising: a
plurality of processors, memory means connected to and shared
by the plurality of processors, one or more processes
executing in the system, each process containing one or more
processing thread groups, each thread group containing at
least a thread group structure, and wherein the thread group
structure maintains the scheduling policy and priority for use
in scheduling the thread group.

20. A multiprocessor system capable of executing a
plurality of processing threads, the system comprising: a
plurality of processors, memory means connected to and shared
by the plurality of processors, one or more processes
executing in the system, each process containing one or more
processing thread groups, each thread group containing at
least a thread group structure, and wherein the thread group
structure maintains the cumulative timeslice information for
all threads within that thread group.

21. A multiprocessor system capable of executing a
plurality of processing threads, the system comprising: a
plurality of processors, memory means connected to and shared
by the plurality of processors, one or more processes
executing in the system, each process containing one or more
processing thread groups, each thread group containing at
least a thread group structure, and wherein the thread group
structure maintains the processor accounting information for
all threads within that thread group.

22. A multiprocessor system capable of executing a
plurality of processing threads, the system comprising: a
plurality of processors, memory means connected to and shared
by the plurality of processors, one or more processes
executing in the system, each process containing one or more
processing thread groups, each thread group containing at
least a thread group structure, and wherein the thread group
structure maintains a thread group ID and each thread
maintains a thread ID.

23. A multiprocessor system capable of executing a
plurality of processing threads, the system comprising: a
plurality of processors, memory means connected to and shared
by the plurality of processors, one or more processes
executing in the system, each process containing one or more
processing thread groups, each thread group containing at
least a thread group structure, and wherein each process
contains (a) at least one thread group table containing
pointers to thread group structures in the process and (b) at
least one thread table containing pointers to threads within
the thread groups.

24. The system of claim 23 wherein each thread maintains
a pointer to its associated thread group structure.

25. A multiprocessor system capable of executing a
plurality of processing threads, the system comprising: a
plurality of processors, memory means connected to and shared
by the plurality of processors, one or more processes
executing in the system, each process containing one or more
processing thread groups, each thread group containing at
least a thread group structure, and wherein the thread group
structure contains a list of processors by which that thread
group may be executed.
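
The items that claims 12 and 19 through 25 place in the thread group structure can be pictured as a small record. The Python sketch below is illustrative only: every field and function name is hypothetical, the tick-based units are an assumption, and the claims do not prescribe any particular layout.

```python
from dataclasses import dataclass


@dataclass
class ThreadGroupStructure:
    """Hypothetical per-group record; all field names are assumptions."""
    tgid: int                              # thread group ID
    policy: str                            # scheduling policy, e.g. "TS" or "RT"
    priority: int                          # priority used to schedule the group
    timeslice_left: int                    # collective timeslice, in ticks (assumed unit)
    cpu_ticks_used: int = 0                # collective processor accounting
    allowed_cpus: frozenset = frozenset()  # processors that may run the group


@dataclass
class Thread:
    tid: int                     # thread ID
    local_priority: int          # independent of the group's priority
    group: ThreadGroupStructure  # pointer back to the thread group structure


def charge_tick(thread: Thread) -> bool:
    """Charge one tick of CPU time to the thread's group.

    Returns True once the group's collective timeslice is used up.
    """
    g = thread.group
    g.cpu_ticks_used += 1
    g.timeslice_left -= 1
    return g.timeslice_left <= 0
```

Because the timeslice and accounting fields live in the group record, ticks consumed by any thread of the group count against the same budget.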

26. A multiprocessor system capable of executing a
plurality of processing threads, the system comprising: a
plurality of processors, memory means connected to and shared
by the plurality of processors, one or more processes
executing in the system, each process containing one or more
processing thread groups, each thread group containing at
least a thread group structure, and wherein the thread group
structure contains an indicator of the minimum allowable
run queue level at which that thread group may be affined.

27. The system of claim 26 wherein the minimum
allowable run queue level for realtime thread groups is Level 2.

28. The system of claim 26 wherein the minimum
allowable run queue level for timeshare thread groups is Level 0.

29. In a multiprocessor system having a shared memory
accessible to all processors; a plurality of secondary cache
memories, each secondary cache memory being accessible
to a subset of the processors; a plurality of Level 0 run
queues, each Level 0 run queue being associated with one
processor and containing the processing tasks affined with
that CPU; a plurality of Level 1 run queues, each Level 1 run
queue being associated with a subset of processors and
containing the processing tasks affined with that subset of
processors; a Level 2 run queue associated with all
processors and containing the processing tasks affined with all
processors in the system; and wherein each task is associated
with a run queue indicator indicating the minimum run
queue level at which that task may be affined; a method of
determining if a task should be moved from the run queue
where the task currently resides to another run queue in the
system when the task is selected for running by an available
processor, the method comprising the steps of:

a) if the selected task is currently in the Level 2 run queue
and the selected task is currently being run by another
processor and the other processor currently running the
task is a member of the same subset of processors to
which the available processor belongs and the task can
be affined at Level 1, moving the selected task to the
Level 1 run queue of the available processor;

b) if the selected task is currently in the Level 2 run queue
and the selected task is not currently being run by
another processor and the task can be affined at Level
0, moving the selected task to the Level 0 run queue of
the available processor;

c) if the selected task is currently in the Level 2 run queue
and the selected task is not currently being run by
another processor and the task can be affined at Level
1 and the task cannot be affined at Level 0, moving the
selected task to the Level 1 run queue of the available
processor;

d) if the selected task is currently in the Level 1 run queue
of the available processor and the selected task is not
currently being run by another processor and the
selected task can be affined at Level 0, moving the
selected task to the Level 0 run queue of the available
processor;

e) if the selected task is currently in a Level 1 run queue
which is not the Level 1 run queue of the available
processor and the task is not currently being run by
another processor and the task can be affined at Level
0, moving the task to the Level 0 run queue of the
available processor;

f) if the selected task is currently in a Level 1 run queue
which is not the Level 1 run queue of the available
processor and the task is not currently being run by
another processor and the task cannot be affined at
Level 0, moving the task to the Level 1 run queue
associated with the available processor;

g) if the selected task is currently in a Level 1 run queue
which is not the Level 1 run queue of the available
processor and the task is currently being run by another
processor, moving the task to the Level 2 run queue;

h) if the selected task is currently in the Level 0 run queue
of another processor and the other processor is
currently running the selected task and the other processor
is not a member of the same subset of processors to
which the available processor belongs, moving the
selected task to the Level 2 run queue;

i) if the selected task is currently in the Level 0 run queue
of another processor and the other processor is
currently running the selected task and the other processor
is a member of the same subset of processors to which
the available processor belongs, moving the selected
task to the Level 1 run queue of the available processor;
and

j) if the selected task is currently in the Level 0 run queue
of another processor and the other processor is not
currently running the selected task, moving the selected
task to the Level 0 run queue of the available processor.

30. In a multiprocessor system having a shared memory
accessible to all processors; a plurality of secondary cache
memories, each secondary cache memory being accessible
to a subset of the processors; a plurality of Level 0 run
queues, each Level 0 run queue being associated with one
processor and containing the processing tasks affined with
that CPU; a plurality of Level 1 run queues, each Level 1 run
queue being associated with a subset of processors and
containing the processing tasks affined with that subset of
processors; a Level 2 run queue associated with all
processors and containing the processing tasks affined with all
processors in the system; and wherein each task is associated
with a run queue indicator indicating the minimum run
queue level at which that task may be affined; a method of
determining if a task in a Level 0 run queue should be moved
to another run queue in the system when the task is selected
for running by an available processor, the method
comprising the steps of:

a) if the selected task is not currently being run by another
processor and the selected task is not in the Level 0 run
queue of the available processor, moving the selected
task to the Level 0 run queue of the available processor;

b) if the selected task is currently being run by another
processor and the other processor currently running the
task is a member of the same subset of processors to
which the available processor belongs, moving the
selected task to the Level 1 run queue of the available
processor; and

c) if the selected task is currently being run by another
processor and the other processor currently running the
task is not a member of the same subset of processors
to which the available processor belongs, moving the
selected task to the Level 2 run queue.

31. In a multiprocessor system having a shared memory
accessible to all processors; a plurality of secondary cache
memories, each secondary cache memory being accessible
to a subset of the processors; a plurality of Level 0 run
queues, each Level 0 run queue being associated with one
processor and containing the processing tasks affined with
that CPU; a plurality of Level 1 run queues, each Level 1 run
queue being associated with a subset of processors and
containing the processing tasks affined with that subset of
processors; a Level 2 run queue associated with all
processors and containing the processing tasks affined with all
processors in the system; and wherein each task is associated
with a run queue indicator indicating the minimum run
queue level at which that task may be affined; a method of
determining if a task in a Level 1 run queue should be moved
to another run queue in the system when the task is selected
for running by an available processor, the method
comprising the steps of:

a) if the selected task can be affined at Level 0 and is not
currently being run by another processor, moving the
selected task to the Level 0 run queue of the available
processor;

b) if the selected task cannot be affined at Level 0 and is
not currently being run by another processor and is not
currently in the Level 1 run queue of the available
processor, moving the selected task to the Level 1 run
queue of the available processor; and

c) if the selected task is currently being run by a processor
that is not a member of the same subset of processors
to which the available processor belongs, moving the
selected task to the Level 2 run queue.

32. In a multiprocessor system having a shared memory
accessible to all processors; a plurality of secondary cache
memories, each secondary cache memory being accessible
to a subset of the processors; a plurality of Level 0 run
queues, each Level 0 run queue being associated with one
processor and containing the processing tasks affined with
that CPU; a plurality of Level 1 run queues, each Level 1 run
queue being associated with a subset of processors and
containing the processing tasks affined with that subset of
processors; a Level 2 run queue associated with all
processors and containing the processing tasks affined with all
processors in the system; and wherein each task is associated
with a run queue indicator indicating the minimum run
queue level at which that task may be affined; a method of
determining if a task in the Level 2 run queue should be
moved to another run queue in the system when the task is
selected for running by an available processor, the method
comprising the steps of:

a) if the selected task can be affined at Level 0 and is not
currently being run by another processor, moving the
selected task to the Level 0 run queue of the available
processor;

b) if the selected task can be affined at Level 1 and cannot
be affined at Level 0 and is not currently being run by
another processor, moving the selected task to the
Level 1 run queue of the available processor; and

c) if the selected task can be affined at Level 1 and is
currently being run by one or more other processors
and all other processors currently running the task are
members of the same subset of processors to which the
available processor belongs, moving the selected task
to the Level 1 run queue of the available processor.

* * * * *